(57) Abstract

Improved methods for determining interactions between peptides or proteins and small molecules are disclosed. The invention methods 
can be used to screen libraries of either the small molecules or the proteins. In general, the methods comprise contacting an agent/ligand
complex consisting essentially of an agent to be tested for binding to a target protein coupled to a ligand capable of binding a proteinaceous 
ligand-binding domain with a first fusion protein comprising said target protein and a first complementary portion of a segregable protein; 
and a second fusion protein comprising a proteinaceous ligand-binding domain and a second complementary portion of said segregable 
protein; and detecting whether the first complementary portion and second complementary portion are brought into proximity. 

SYSTEM TO DETECT SMALL MOLECULE/PEPTIDE INTERACTION

Technical Field 

The application is in the field of pharmaceutical development. The present 
invention provides reagents, cells and methods for determining interactions between
protein targets and small molecules, so as to identify unidentified target and related 
proteins, as well as to identify agents, particularly small molecules, that bind to an 
identified target protein. 

Background Art

A current focus of pharmaceutical development is the identification of agents 
(small molecules) which bind to a target protein and thus modulate a biological activity 
which is mediated by the target protein. However, few methods are available for 
identification of unknown target proteins that are bound by an agent with biological 
activity. Current technologies are generally based on directly determining whether an
agent binds to a particular target protein or are based on determining whether the 
agent blocks a biological activity mediated by the target protein. Each of these current 
systems limits the number of agents and target proteins that can be effectively screened 
because of the nature and number of steps employed. Further, these methods typically 
focus on identifying the target in one species only. They do not provide a spectrum of 
viable targets whereby valuable information could be found concerning the nature of 
the interaction. 

Recently, an assay system was described in which molecular genetic techniques 
are used to identify protein/protein binding events (Fields, et al., US Patent No. 
5,283,173). The approach employed by Fields is a "two hybrid" protein system. In 
general, the Fields method uses two fusion proteins which are expressed within a single 
host cell, along with a reporter system to measure their interaction. The first fusion 
protein consists of one of the test proteins and a first portion of a transcription 
activator, e.g., a DNA binding domain. The second fusion protein consists of a second 
test protein and a second, complementary, portion of the transcription activator, e.g.,
an RNA polymerase activating domain. If the two test proteins bind to each other, the
two transcription factor portions come into close proximity, thus reconstituting an
active transcription factor, which then induces transcription of a gene encoding a
detectable marker.

Although suitable for use in screening for protein/protein interactions, the
system of Fields is not, in its current form, usable to screen non-protein agents,
particularly small molecules, for binding to a single target protein, since Fields requires
that both of the test substances whose binding is to be assessed be proteins capable of
expression within a cell.

Fields represents a special case of a generic approach to detection of
protein-protein interactions involving chimeric fusion proteins. Another representative
embodiment of such approaches is found in the studies of dimerization of cytoplasmic
domains of receptors by administration of small molecules as described by Spencer,
D.M. et al., Science (1993) 262:1019-1024. In another embodiment, a chimeric
construct of the Raf-1 serine/threonine kinase protein with a peptide sequence capable
of dimerization in the presence of the antibiotic coumermycin is described by Farrar,
M.A. et al., Nature (1996) in press. In addition to these cellular systems, it has been
found that β-galactosidase can be segregated into complementary portions which
become operable when placed in proximity by virtue of association in the tetrameric
mature protein described by Ullmann, A. et al., J Mol Biol (1967) 24:339-343.

For β-galactosidase, this is a result of "α-complementation" in which an
inactive N-terminal deletion and an active point mutant can complement in the
tetramer. It will be apparent that the reconstituted activated "protein" need not arise
from a single amino acid chain; an additional example relates to the SH2 domain
binding to a phosphotyrosine domain which constitute an active P complex that was
never a single protein.

The present invention provides improvements to these techniques and adapts 
these systems for the interaction of agents other than protein, such as small molecules, 
with target proteins. 

Disclosure of the Invention

The invention provides a convenient system whereby libraries of target proteins 
can be screened for interaction with non-protein agents, whereby interaction profiles 
for proteins or small molecules can be obtained, and whereby libraries of small 
molecules can be screened for interaction with a target protein. In general, the
invention provides a means to study the interaction of a target protein with a small
molecule agent by utilizing two chimeric proteins: one comprising a first
complementary proteinaceous portion of a segregable protein linked to a target protein,
and a second chimeric protein which comprises a second complementary portion of
said segregable protein linked to a ligand-binding protein domain, where the activity of
the segregable protein depends on proximity of the first and second complementary
portions. The agent to be tested can then be supplied in a complex with a ligand
capable of binding this ligand-binding domain. Thus, the second chimeric protein will
now be effectively associated with the agent to be tested. Successful interaction of the
agent with the target protein results in proximity of the two complementary portions
required for constituting the active protein or complex. The activity of the complex or
protein can then be detected by means which depend on its nature. Also, depending on
the nature of the active complex or protein, the assays can be conducted intracellularly
or extracellularly.

In one preferred embodiment, the invention comprises an extension of the
two-hybrid system of Fields described above wherein one of the test components is supplied
in the form of a complex with a ligand, which ligand binds to a ligand-binding domain
contained on one of the two fusion proteins in the two-hybrid system. By using this
complex, binding agents other than proteins can be coopted to participate in the
interaction which results in the reconstitution of the transcription factor required for
production of the detectable marker in the two-hybrid system.

Thus, in this aspect, the invention is directed to a recombinant host cell
modified to contain: i) a detectable marker expression unit which comprises an
inducible promoter operably linked to a nucleotide sequence encoding a detectable
marker, wherein the expression of said detectable marker is regulated by said inducible
promoter; ii) a target peptide expression unit which comprises a nucleotide sequence
encoding a fusion protein comprising a target peptide or protein and a first portion of a
transcription activator protein selected from the group consisting of a DNA binding
domain and an RNA polymerase activation domain; and iii) a ligand binding domain
expression unit which comprises a nucleotide sequence encoding a fusion protein
comprising a ligand binding domain and a second portion of a transcription activator
protein selected from the group consisting of a DNA binding domain and an RNA
polymerase activation domain, whichever is not employed in (ii).

The invention also employs other embodiments of the general theme of testing
the ability of a target protein to bind to a small molecule agent by assessing a result of
the proximity of two complementary portions of a segregable active protein which
results from binding of agent to target. One additional embodiment involves taking
advantage of the importance of dimerization or oligomerization of cytoplasmic regions
of cellular surface receptors in transducing signals. If each partner in the dimer or
oligomer can be brought into proximity, a signal will result which can then be detected
using a suitable assay. Additional signaling events can occur by virtue of proximity of
complementary portions or subunits of other proteins as well. Indeed, some systems
can be practiced extracellularly, such as that based on the association of two
segregable portions of β-galactosidase. In every instance, the invention method
involves two fusion proteins, one comprising a first segregable portion of an active
protein or complex and a target protein and the other fusion protein comprising a
ligand-binding domain and the complementary portion of the segregable active protein
or active complex. In all embodiments, the invention also employs a complex of an
agent, typically a small molecule, covalently attached to a ligand, wherein the ligand
binds to the ligand-binding domain. The complex is readily introduced into the cell, if
required. The ligand secures the complex to the fusion protein containing the
ligand-binding domain. In this way, a potential interaction between the agent and the target
protein can be measured.

The invention is thus directed to methods to determine whether target proteins
bind to agents, typically small molecules, and to methods to screen libraries of target
peptides with respect to a biologically active agent, or to screen libraries of candidate
small molecules with respect to a known target protein. The invention is also directed
to methods to obtain profiles of a small molecule agent against a panel of target
proteins or vice-versa. The method of the invention can also be used as a basis for a
system of classifying proteins based on their binding specificities.

Brief Description of the Drawings

Figure 1 provides a diagrammatic representation of an intact transcription
factor regulating expression of a reporter gene (A), the Fields two hybrid protein
system (B) and the invention hybrid system (C).

Figure 2 provides a detailed diagrammatic representation of one embodiment of
the elements of the system of the present invention: (1) DNA binding domain of a
transcription factor; (2) ligand binding domain fused to (1); (3) ligand bound by (2);
(4) complexed agent to be tested for binding; (5) member of protein family to be tested
for interaction with (4); (6) RNA polymerase activation domain of a transcription
factor, fused to (5); (7) RNA polymerase; (8) inducible promoter sequence; (9)
reporter gene sequence; (3)+(4) agent/ligand complex; (1)+(2) proteinaceous ligand
binding domain expression unit; (5)+(6) target peptide expression unit; (8)+(9)
detectable marker expression unit.

Modes for Carrying Out the Invention 

In detail, the present invention provides methods, reagents, and cells for use in 
determining whether an agent which is not necessarily a protein or peptide binds to a 
protein. Such methods, reagents, and cells can be used to screen a library of proteins 
to identify a protein bound by a particular agent or can be used to identify agents that
bind a defined target peptide. The methods can also be used to provide profiles for
various agents against panels of proteins using methods described in allowed U.S.
Serial No. 08/177,673 and Yang et al., Nucleic Acids Res (1995) 23:1152-1156,
incorporated herein by reference.

The detection of interaction of an agent with a target protein in every case
relies on effecting a proximity of segregable complementary portions of a protein. By
"segregable complementary portions of a protein" is meant that there are two
proteinaceous segments which, when placed in proximity, are capable of effecting an
activity, while when they are separated, the activity is not present. The proteinaceous
complementary portions can originate from a single protein, as, for example, when they
derive from a transcription activator, or may originate from related proteins as is the
case described above for the β-galactosidase tetramer, or may be different proteins,
such as an SH2 domain and a phosphotyrosine domain.

The interactions effected by the binding of target protein to a small molecule
agent occur at the protein level, although the systems employed in the invention
methods may be at the nucleic acid level which in turn effects generation of the fusion
proteins. It is possible to conduct the assay entirely at the protein level, provided an
extracellular system is employed. In such an in vitro system, fusion proteins, perhaps
produced recombinantly, are used as simple reagents, the agent-containing complex is
added, and the proximity of the portions of the segregable protein is assessed using an
appropriate assay. This is illustrated by the use of fusion proteins containing
complementary portions of β-galactosidase; when brought into proximity, the active
enzyme can be detected by an appropriate enzyme activity assay. Thus, in such an in
vitro system, a fusion protein containing one complementary portion of
β-galactosidase is fused to a target protein; a second portion of the β-galactosidase is
fused to a ligand-binding domain; and the agent to be tested, coupled with a ligand
which binds to the ligand-binding domain, is added to a reaction chamber containing the
two fusion proteins. Successful binding of agent to target is detected by
β-galactosidase activity.

More commonly, however, the method of the invention is conducted
intracellularly and the fusion proteins are generated from recombinant materials
introduced into the cell. The cell is also modified, if necessary, to provide a reporter
system which will permit assessment of the proximity of the complementary portions of
the segregable active protein. If the active protein is a transcription activator,
generally the reporter system comprises a reporter gene operably linked to a promoter
which is induced by the transcription factor. If the result of the interaction of the
complementary portions of the active protein is generation of a signal, alternative
assays can be used to detect the presence of the signal. The signal may itself result in
transcription of an inducible gene either native to the cell or introduced. The effect
need not be direct. For example, an interaction which stimulates production of a
second messenger can lead to a detectable signal as a consequence of the signal
transduction cascade induced by the second messenger. Further, the detectable signal
does not necessarily require new gene transcription. For example, the generated signal
may lead to translocation of an immunologically detectable protein from the interior of
the cell to the surface.

This intracellular embodiment utilizes expression units for the fusion protein 
and, if needed, an expression unit for a reporter gene. The particulars of the 
intracellular embodiment can perhaps best be illustrated by a detailed description of the 
application of the yeast two-hybrid system to the invention method as follows. 

The two-hybrid method for detecting protein-protein interaction is disclosed by
Fields et al., U.S. Patent No. 5,283,173 (incorporated herein by reference).
Specifically, the Fields system provides plasmids that express two fusion (or hybrid)
proteins in a host cell. The first plasmid comprises a nucleotide sequence that encodes
either the DNA-binding domain of a transcription activator or the RNA polymerase
activation domain of a transcription activator fused to a first peptide encoding sequence,
and the second plasmid comprises a nucleotide sequence that encodes either the DNA-binding
domain of a transcription activator protein or the RNA polymerase activation domain
of the transcription activator (whichever is not present in the first plasmid) fused to a
second peptide encoding sequence. A third expression unit in the same host contains
one or more detectable markers whose expression is controlled by a promoter that is
activated by the associated two transcription activator domains contained in the fusion
proteins. Interaction of the first and second proteins encoded by the plasmids causes
the DNA binding domain and the activation domain to become closely associated, thus
reconstituting the functionality of the transcription activator protein and allowing the
detectable marker gene to be expressed. See also Figure 1 and Bartel, P.L. et al.,
Nature Genet (1996) 12(1):72-77. This system has been used to study the interaction
of short peptides with a target protein, yielding quantitative results proportional to the
known affinity constants over a studied range of 10-100 µM (Yang, M. et al., Nucl
Acids Res (1995) 23(7):1152-1156). It has also been applied to the extracellular
domains of receptors, expressed intracellularly (Young, K.H. et al., PCT application
WO 95/34646).

The system of Fields could theoretically be extended to small molecules by
replacing one of the fusion proteins with a small molecule covalently attached either to
the RNA polymerase activation domain or the DNA binding domain of the
transcription activator protein. The resulting domain/small molecule fusion would then
need to be introduced into a cell in order to test whether the small molecule interacts
with a target or test peptide/protein. Since permeabilization of a cell to the protein
transcription activator domain will be difficult, such a method is not generally
workable; further, coupling the small molecule to a protein is less convenient than
producing a fusion protein.

The present invention offers a more efficient solution for extending the
two-hybrid system to the study of small molecule/peptide (protein) interactions and solves
the problems that are associated with protein permeabilization. By taking advantage of
the interaction of a ligand with a proteinaceous ligand binding domain contained in one
of the fusion proteins, the molecule can be introduced into the cell as part of a complex
with a simple ligand. The cell will more readily be permeable to such a complex. For
example, the interaction of biotin with a biotin-binding domain such as that of avidin
could be used.

The methods and cells of this embodiment use four elements, of which three
are nucleic acids: 1) a target peptide expression unit; 2) a proteinaceous ligand binding
domain expression unit; and 3) a detectable marker expression unit. The fourth
element is an agent/ligand complex.

As used herein, an "expression unit" refers to a nucleic acid molecule that
contains a coding nucleotide sequence that can be transcribed and translated in a host
cell. Expression units are comprised of an expression control element, for example, an
inducible or constitutive promoter, that is operably linked to a protein-encoding
sequence. The choice of the expression control element will depend primarily on the
host cell employed. A skilled artisan can readily employ any art-known expression
control sequence for use in the expression units of the present invention.

WO 98/16835    PCT/US97/17975

In this embodiment, both the target peptide expression unit and the
proteinaceous ligand-binding domain expression unit employ sequences encoding
fusion proteins which contain complementary portions of a transcription activator
protein. The transcription activator protein has at least two separable portions, each
needing to be present to have an active transcription activator protein. The preferred
transcription activator will have a DNA binding domain peptide and an RNA polymerase
activation domain peptide. Transcription activator proteins that have separate DNA
binding and activation domains are known in the art for organisms such as yeast and
include, but are not limited to, the yeast Gal4, GCN4, ADR1, Hap1, Swi5, Ste12,
Mcm1, Yap1, Ace1, Ppr1, Arg81, Lac9, Qa1F, VP16, and LexA proteins,
non-mammalian nuclear receptors, such as the ecdysone receptor, and mammalian nuclear
receptors such as the estrogen, androgen, glucocorticoid, mineralocorticoid, retinoic acid and
progesterone receptors. The choice of the transcription activator protein used will
depend primarily on the host cell chosen. The preferred transcription activators will be
active in yeast, the preferred being the yeast Gal4, GCN4 and ADR1 transcription
activator proteins.

Thus, the "target peptide expression unit" will encode a fusion protein
containing a "target peptide" and one of the domains described above which is a
portion of a transcription activator protein. As used herein, a "target peptide" can be
any peptide or protein of two or more amino acids in length. The terms "peptide" and
"protein" are used interchangeably in the present application and do not contain
implications as to the length of the amino acid sequence. The target peptide may be a
known peptide or protein of known biological function, or may be an isolated protein
whose biological function is not known. The target protein may also be contained in a
library of peptides of differing amino acid sequences. When the invention methods are
used to identify peptides that bind to a particular agent, then the target peptide will
preferably be contained in a mixture of peptides with differing amino acid sequences,
as in an expression library or a combinatorial peptide library. When the method is used
to identify agents that bind to a particular target peptide, the preferred target peptide is
either the entire protein, such as a receptor, or a fragment of the protein that contains
a particular domain, for example, a fragment that contains the active site of an enzyme.

The second element is a "proteinaceous ligand binding domain expression
unit." As used herein, a "proteinaceous ligand binding domain expression unit" or
"ligand binding domain expression unit" is defined as a nucleic acid molecule that
comprises a nucleotide sequence that encodes a fusion protein. The fusion protein
contains a "proteinaceous ligand binding domain" and a second portion of a transcription
activator protein, either the RNA polymerase activation domain or the DNA binding
domain, whichever is not encoded in the target peptide expression unit; namely, if the
target peptide expression unit contains a nucleotide sequence encoding the DNA
binding domain, then the ligand binding domain expression unit will contain a
nucleotide sequence encoding the RNA polymerase activation domain and vice-versa.
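
By way of illustration only, the "whichever is not encoded" rule for the two fusion proteins can be sketched in a few lines of Python. The names `ActivatorDomain`, `complementary_domain` and `design_fusions` are hypothetical and are introduced here solely for the sketch; they do not form part of the disclosed reagents.

```python
from enum import Enum
from typing import Dict


class ActivatorDomain(Enum):
    """The two separable portions of the transcription activator (hypothetical names)."""
    DNA_BINDING = "DNA binding domain"
    ACTIVATION = "RNA polymerase activation domain"


def complementary_domain(domain: ActivatorDomain) -> ActivatorDomain:
    """Return the activator portion that the other fusion protein must carry."""
    if domain is ActivatorDomain.DNA_BINDING:
        return ActivatorDomain.ACTIVATION
    return ActivatorDomain.DNA_BINDING


def design_fusions(target_domain: ActivatorDomain) -> Dict[str, ActivatorDomain]:
    """Assign one activator portion to each expression unit.

    The target peptide expression unit receives the chosen portion; the
    ligand binding domain expression unit receives whichever portion is left.
    """
    return {
        "target peptide expression unit": target_domain,
        "ligand binding domain expression unit": complementary_domain(target_domain),
    }
```

Under these assumptions, `design_fusions(ActivatorDomain.DNA_BINDING)` assigns the RNA polymerase activation domain to the ligand binding domain expression unit, and the reverse choice swaps the two assignments.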

As used herein, a "proteinaceous ligand binding domain" is defined as a peptide
that binds to a particular "ligand." As such, the "proteinaceous ligand binding domain"
is a paired compound to the "ligand" employed. Examples of paired proteinaceous
ligand binding domain and ligand compounds include, but are not limited to, a biotin
binding domain/biotin pair, the FLAG peptide detection system pair, an
antibody/hapten pair, a carbohydrate binding lectin/complementary carbohydrate pair,
a drug/protein pair (e.g., cyclosporine to cyclophilin or FK506 to FKBP) and a
single-chain Fv antibody fragment/agent ligand complex pair. The preferred pair is biotin
binding domain/biotin. A skilled artisan can readily adapt any known proteinaceous
binding domain/ligand pair for use in the present methods.

As used herein, a "biotin-binding domain" is defined as a peptide sequence that
binds to biotin. The preferred biotin-binding domains are found in avidin and
streptavidin (Hiller et al., Biochem J (1991) 278:573-585). Alternatively, biotin binding
partners isolated from a random peptide library, such as those described by Saggio et
al., Biochem J (1993) 293:613-616, or those identified using the present methods, can
be used.

The third element is a "detectable marker expression unit." As used herein, a
"detectable marker expression unit" comprises a nucleic acid molecule that encodes
one or more detectable markers whose expression is controlled by a promoter that
requires the binding of a transcription activator protein for transcription to occur.
Promoters that require the binding of a transcription activator are well known in the
art. These include, but are not limited to, the promoters bound by the yeast Gal4,
GCN4 and ADR1 proteins. The nucleotide sequence of the promoter employed will
depend on the transcription activator protein chosen for the fusion proteins described
above. The promoter is chosen such that transcription occurs when the two portions
of the activator that are present in the fusion proteins become associated with each
other and bind to the promoter.

As used herein, a nucleotide sequence is said to encode a detectable marker
when, upon expression, the expressed nucleotide sequence produces a feature that can
be detected. For example, 1) detection can be via complementation of nutritional
auxotrophy, where the detectable marker complements a mutation found within the
cell, such as a gene complementing a mutation in the biosynthetic pathways of amino
acids, such as the His, Leu, Arg, Met, Lys and Trp pathways; 2) detection can be
based on the production of an identifiable or assayable signal, such as β-galactosidase
or green fluorescent protein (Atkins et al., Curr Genet (1995) 28:585-588);
3) detection can be by cell death, such as with the use of a toxic pro-drug; 4) detection
can be based on the resistance to a normally toxic agent such as an antibiotic, e.g., by
the methods of Rotman, U.S. Patent No. 5,472,846; 5) detection can be based on
genes conferring sensitivity to a chemical, such as CYH2 or CAN1;
6) detection can be based on the production of an agent that is readily detectable
through assay means such as antibody binding or can readily be separated with an
antibody or magnetically labeled probe; or 7) the reporter gene may provide a protein
which migrates to the surface of the cell in which it is produced, permitting the
separation of the cells expressing the reporter gene using affinity chromatography, for
example, with respect to the cell surface reporter protein. A skilled artisan can readily
adapt any one of the available detection/marker systems known in the art for use with
the methods and cells of the present invention.

In one application of the present invention, the detection system is chosen such
that cells can be screened based on the transient expression of the detection system
employed. Such systems typically use a fluorescent marker protein, such as the green
fluorescent protein, that can be identified without waiting for any significant cell
growth to occur. A transient system employing a fluorescent marker allows the use of
a fluorescence-activated cell sorter that can identify a single fluorescent cell in a
population of as many as 10^6 cells.

The above three nucleic acid elements can be present as isolated nucleic acid
molecules or can be present in one or more vectors. As used herein, "vector" is
defined as a nucleic acid molecule that can autonomously replicate within a host cell.
Vectors based on episomal elements such as plasmids, and vectors based on viral or
chromosomal origins of replication, are well known in the art and can readily be
modified to contain one or more of the nucleic acid elements used in the present
invention. In one application, the three nucleic acid elements are contained on separate
vectors. In another application, all of the elements are present on a single vector. In a
third application, one or more of the elements are integrated into the chromosome of a
host cell.

The fourth element used in the illustrated method is an "agent," typically
other than a protein, that has been covalently attached to a ligand to form an
agent/ligand complex. The agent can be any substance, including peptides, small
molecules, vitamin derivatives, and carbohydrates. The preferred agents are
non-protein, small molecule agents. A skilled artisan can readily recognize that there is no
limit as to the structural nature of the agents assayed by the present methods.

All of the invention embodiments utilize a ligand/agent complex. A variety of
techniques known in the art can be used to attach a ligand to the agent. The method
employed will depend primarily on the agent and ligand chosen. Such methods
include, but are not limited to, direct chemical linking of the ligand to the agent or
using a linker to attach the ligand to the test agent, such as the use of a photoactivated
biotin sold by Pierce Chemical (Rockford, IL). A skilled artisan can readily adapt any
of the presently available methods for attaching a ligand, such as biotin, to agents for
use in the present invention.

The elements described above are used in conjunction with a recombinant host
cell that has been or can be transformed to contain the nucleic acid elements and the
agent/ligand complex. As used herein, a "recombinant host cell" is defined as any cell
that can be genetically altered so as to contain introduced nucleic acid molecules. The
terms "cell," "cells," "cell cultures," and the like are used interchangeably and can
indicate a single cell or a plurality of cells as the context indicates. Both procaryotic
and eucaryotic cells can be used; yeast cells are preferred, most preferably
Saccharomyces cerevisiae.

Accordingly, embodiments conducted in intracellular environments provide
cells transformed or modified to contain a ligand binding domain expression unit, and
optionally a detectable marker expression unit as herein defined. Such cells will further
be modified to contain a target peptide expression unit. The expression units
contained in the hosts of this embodiment can be either integrated into the host
chromosome or can be present within the host cell in the form of an episomal unit. A
skilled artisan can readily use art-known methods to generate the host cells of the
present invention.

To perform the method of the present invention, the test agent/ligand complex
is introduced into a cell that contains a target peptide expression unit, a ligand-binding
domain expression unit, and, if required, a detectable marker expression unit. A
variety of techniques are presently known in the art for introducing small molecules
and/or DNA into cells, and in particular yeast cells. Such methods include, but are not
limited to, electroporation, lipofection, liposome delivery, transient saponification,
natural uptake means, Penetratin-1 (a 16-amino-acid peptide known in the art as a cell
permeation vehicle commercially available from Oncor (MD)), and chemical
transformation, such as using lithium acetate.
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The agent/ligand complex and expression vectors can be introduced into the 
host cell at the same time. Alternatively, one or more of the expression units can be 
introduced into the host cell prior to the introduction of the agent/ligand complex. In 
such an use, the expression units can further be integrated into the host chromosome 
5 or can be present as episomal units. 

After the agent/ligand complex is introduced into the cell, the cell is incubated
under conditions in which the target peptide expression unit and the ligand-binding
domain expression unit are expressed. The agent/ligand complex will become
associated with the fusion protein encoded by the ligand-binding domain expression
unit due to the normal interaction of the ligand and the ligand-binding domain. If the
agent and target peptide interact, the two portions of, for example, the transcription
activator protein will come into close proximity, allowing the transcription activator
protein to bind to the promoter of the detectable marker expression unit. If the
agent/ligand complex does not bind to the target peptide, then the two portions of the
transcription activator protein will not come into close proximity and the detectable
marker expression unit will not be expressed.
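
The readout logic of the preceding paragraph can be summarized as a truth table.
The following Python sketch is illustrative only: the function name and its boolean
inputs are hypothetical and not part of the disclosure, and each binding event is
treated as all-or-nothing for the transcription activator embodiment.

```python
def marker_expressed(ligand_binds_domain: bool, agent_binds_target: bool) -> bool:
    """Return True when the two activator portions are brought into proximity.

    Hypothetical model: the marker is expressed only when the ligand half of
    the agent/ligand complex is held by the ligand-binding domain fusion AND
    the agent half is bound by the target peptide fusion.
    """
    return ligand_binds_domain and agent_binds_target


# Enumerate the four possible cases.
for ligand_ok in (True, False):
    for agent_ok in (True, False):
        print(ligand_ok, agent_ok, marker_expressed(ligand_ok, agent_ok))
```

Only the case in which both interactions occur yields marker expression, which is
why a positive signal can be attributed to binding of the agent by the target peptide.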

The incubation conditions for the host cell will vary and will depend on the
organism utilized, the detectable marker chosen, and the expression control elements
(e.g., inducible or constitutive promoters) used in the target peptide expression unit and
ligand-binding domain expression unit. A skilled artisan can readily determine the
appropriate conditions needed to achieve expression of the target peptide expression unit
and ligand-binding domain expression unit based on the specific elements employed.

One obstacle to the use of yeast as a host organism is that yeast possesses
molecular pumps that are efficient in removing xenobiotics from within the cell. Some
of this activity has been shown to be associated with a pump related to the mammalian
MDR pump. Activity of this pump should be largely irrelevant in the present method
because of the high affinity of most ligand binding domains for their cognate ligand.
However, to increase the efficiency of the present method when yeast cells are used as
a host cell, the yeast cells can contain a mutation that reduces the activity of the MDR
pump. One such mutant presently known in the art is the sterile-6 mutant of
Saccharomyces cerevisiae.

Although the foregoing description focuses on improvements in the two-hybrid
system described by Fields, it will be apparent that the method of the invention is not
limited to this embodiment. When the method of the invention is performed in an
intracellular context, the cell used in the method will contain both a target peptide
expression unit and a proteinaceous ligand-binding domain expression unit as defined
in accord with the yeast two-hybrid system, except that the complementary portions of
a segregable active protein or complex need not be portions of a transcription factor,
but can be any two complementary portions which result in a detectable activity when
brought into close proximity. Additional examples of such complementary portions
have been described above. Depending on the cell used in the assay, it may or may not
be necessary to include a separate detectable marker expression unit as defined above.
The cell may inherently contain a mechanism for detecting the activity of the proximal
protein components. Regardless of the choice for the complementary portions,
however, the small molecule candidate binding agent will be supplied in the form of a
complex with a ligand which has a strong affinity for the amino acid sequence that
represents the ligand-binding domain. Thus, methods of the invention performed
intracellularly involve cells that contain, at a minimum, expression units for the two
relevant fusion proteins and optionally, when necessary, a detectable marker
expression unit. The embodiments described above with respect to the nature of the
ligand and ligand-binding domains, the types of markers possible, etc., are applicable in
these embodiments as well.

The cells and methods of the invention are readily adapted for use in a 96-well
plate format. Such a format allows for the screening of a large number of agents or
target peptides, particularly the products of combinatorial enzymology; for example,
see McDaniel, et al., Science 252:1546-1550 (1993).

In one example of such a use, an agent that has a biological activity but whose
protein target is unknown is used to identify the relevant target protein. In such an
application, a peptide library encoded by a cDNA library is used as the source for the
peptide encoding sequences used in the target peptide expression unit. The collection
of peptide encoding sequences thus results in a multiplicity of target peptide expression
units. A population of host cells is transformed with the mixture such that each host
cell, or a large proportion of the host cells, contains a different amino acid sequence for
the target peptide.

An agent/ligand complex is then introduced into the transformed population of
host cells. Host cells containing peptide sequences to which the agent binds can be
detected via the detection system employed. If a fluorescence-activated cell sorter
and a fluorescent detection method are used, thousands of individual cells
can be rapidly screened.

Cells identified as containing a peptide that binds the test agent are isolated and
the nucleic acid encoding the target peptide is examined using art-known DNA
sequencing methods. The target peptide thus identified can then be 1) identified as the
biological target of the agent, 2) used in a competitive binding format to identify other
agents that may bind with greater/lesser selectivity, affinity, or avidity to the target
peptide, 3) used as a diagnostic agent for use in binding assays to determine the
presence of the agent in a sample, or 4) used as an affinity ligand for the agent. The
methods of the present invention are particularly useful in identifying diagnostic agents,
such as glubodies (described in U.S. Serial No. 08/380,188 incorporated herein by
reference), that can be used in environmental monitoring, such as in assaying for the
presence of a particular pesticide in a sample.

In addition to identifying a target receptor, information regarding the nature
and sequences of peptides bound by a particular agent can be useful in assessing and
predicting the toxicity of a given agent. Specifically, using a cDNA library as a source
for the target peptide sequences, the method of the present invention will not only
identify the biological target of a particular agent, but will also identify other molecules
to which the agent binds. Such "runner-up" target peptides that are bound by the
agent can be used to assess the toxicity and selectivity of a given agent.

In still another use, a known target, such as a receptor, can be used to screen a
library of small molecules. Extensive libraries of compounds are available for testing
or can be synthesized with the ligand incorporated, as in combinatorial chemistry
methods; in this case, the ligand can be attached to all compounds as the first step of
their synthesis, if desired. In this application, the host cell will contain, in the target
peptide expression unit, a nucleotide sequence encoding the relevant receptor, and the
agent/ligand complex will comprise a member of the candidate library covalently bound
to the chosen ligand. The determination of the interaction of the agent with the target
peptide is conducted as herein described. The agent/ligand complex can be supplied as
part of a mixture of complexes containing a variety of agents, which can then be
separated by dilution of the resultant transformed cells, or can be supplied as an
individual member of a panel.

The reagents, cells and methods of the invention are also useful in obtaining
profiles of agents against protein panels as described in allowed U.S. Serial No.
08/177,673. In this use, a panel of cells containing the three expression units of the
invention is provided, wherein each member of the panel contains a different target
peptide expression unit. Each desired agent/ligand complex is then tested against each
member of the panel to obtain the desired profile.
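
The profiling step above amounts to filling an agent-by-target table. The Python
sketch below shows one way such a table might be assembled; the names
build_profiles and mock_assay, the agent and target labels, and the numeric
readings are all hypothetical placeholders, not data or procedures from the
disclosure.

```python
from typing import Callable, Dict, List


def build_profiles(
    agents: List[str],
    panel: List[str],
    assay: Callable[[str, str], float],
) -> Dict[str, Dict[str, float]]:
    """Test every agent/ligand complex against every panel member."""
    return {
        agent: {target: assay(agent, target) for target in panel}
        for agent in agents
    }


# Hypothetical marker readings standing in for real measurements.
MOCK_READINGS = {
    ("agent_1", "target_A"): 0.9,
    ("agent_1", "target_B"): 0.1,
    ("agent_2", "target_A"): 0.0,
    ("agent_2", "target_B"): 0.7,
}


def mock_assay(agent: str, target: str) -> float:
    """Placeholder for the detectable marker readout of one panel cell."""
    return MOCK_READINGS[(agent, target)]


profiles = build_profiles(
    ["agent_1", "agent_2"], ["target_A", "target_B"], mock_assay
)
```

Each inner dictionary is the profile of one agent across the panel; in practice the
assay callable would wrap whatever marker measurement is used.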

The following examples are intended to illustrate but not to limit the invention. 

Preparation A

Preparation of Derivatized Agent
The agents to be tested for binding to protein are coupled to commercially
available biotin analogs (Pierce Chemical) that incorporate a spacer arm optimized for
use in affinity binding experiments and include various reactive groups. Such reactive
groups may include a photoactivated free radical conjugating system, which yields a
collection of conjugates with the biotin attached at different places. When this system
is used at low stoichiometry, the probability that some of the conjugates will have the
binding properties of the agent to be tested is high, and the relatively large size of most
natural product agents makes them particularly suitable for this kind of derivatization.
Alternatively, the reactive groups are less promiscuously reactive, based on known
selectivities of certain radical generating moieties (Barton, D.H. et al., J Am Chem Soc
(1961) 83:4083-4089), or widely used moieties such as N-hydroxysuccinimide for
reaction with amines.

The biotin-derivatized agents are isolated on a commercially available avidin
affinity sorbent (Sigma), and separated from free biotin by HPLC. Competitive
binding assays confirm that the derivatized agents still bind the same site: following
separation of bound and free agents by size exclusion filtration (Sarstedt centrifuge
filtration devices), HPLC of the native agent determines the proportion bound to target
in the presence of varying molar ratios of derivatized agent. The derivatized agent may
also be analyzed by NMR and/or elemental analysis.

Preparation B 

In Vitro Assay of Affinity Sorbent Containing Derivatized Agent 
The derivatized agent is captured on commercially available agarose beads
conjugated with avidin (Sigma) to obtain affinity sorbent for the target. The purified
protein target of the agent is applied to the sorbent and the binding capacity for target
is measured by comparing protein content of the material applied to the sorbent with
that of the unretained material. As a control, the bound proteins can also be eluted
under strongly denaturing conditions, separated by SDS gel electrophoresis, blotted to
a membrane, and probed with commercially available antibodies specific for the known
target proteins.
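
The binding capacity measurement described above is a simple difference between
applied and unretained protein. The Python sketch below illustrates the arithmetic;
the function name, the choice of milligrams and milliliters as units, and the
normalization per volume of sorbent are assumptions made for illustration and are
not specified in the text.

```python
def binding_capacity(applied_mg: float, unretained_mg: float, sorbent_ml: float) -> float:
    """Return protein retained per mL of sorbent (mg/mL).

    Retained protein is estimated as the protein applied to the sorbent
    minus the protein recovered in the unretained (flow-through) material.
    """
    if sorbent_ml <= 0:
        raise ValueError("sorbent volume must be positive")
    if unretained_mg > applied_mg:
        raise ValueError("unretained protein cannot exceed applied protein")
    return (applied_mg - unretained_mg) / sorbent_ml


# Hypothetical numbers: 10 mg applied, 4 mg unretained, 2 mL of sorbent.
capacity = binding_capacity(10.0, 4.0, 2.0)
print(capacity)
```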

The specificity of an affinity sorbent is also evaluated by contacting it with
tissue extracts known to contain the target proteins and eluting bound proteins. The
eluted proteins are analyzed by electrophoresis followed by silver staining and/or by
staining membrane-transferred material with appropriate antibodies.

The foregoing Preparations A and B describe methods for preparing materials 
for use in assessment of binding targets for agents to be tested and vice versa as well 
as for controls and for characterization of already identified interactions.

Example 1

2-Hybrid Assay

A cDNA library in yeast is the source of fusion proteins comprising the
DNA-binding domain of the yeast Gal4 transcription factor and the expression product
of a cDNA insert. Vectors containing these inserts are prepared as described by Fields,
et al. in U.S. Patent No. 5,283,173 cited hereinabove. The biotin binding domain
expression unit employs a minimal fragment of streptavidin with superior binding
ability for biotin conjugates (Sano, T. et al., J Biol Chem (1995) 270(47):28204-28209).

The nucleotide sequence encoding the streptavidin fragment is ligated in
reading frame with the RNA polymerase activating domain of Gal4 in the appropriate
vector. The biotin-derivatized test agent of Preparation A is introduced into transiently
permeabilized yeast cells containing the expression systems for cDNA encoding target
and biotin binding domain and for the reporter gene β-galactosidase using the method
of Gift, E.A. et al., Biochim Biophys Acta (1995) 1234:52-62. Due to the high
affinity of avidin for biotin, the test agents are well retained in the yeast cells.
Expression of reporter is quantitated with the chromogenic substrate ONPG,
o-nitrophenyl-β-D-galactopyranoside (Sigma).

As controls, target protein expression systems employing sequences encoding
proteins with known small molecule binding agents, for example cyclophilin or Protein
Kinase C, are also included in the host yeast cell. As a further control, these vectors
are spiked at decreasing doses into the cDNA library, and the recovery rate of the
spiked clones indicates the noise level of the system.
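
The spike-in control above yields one recovery rate per dose. The Python sketch
below shows how such rates might be tabulated; the function name, the
representation of a dose as a fraction of the library, and the clone counts are
hypothetical and serve only to illustrate the calculation.

```python
from typing import Dict, Tuple


def recovery_rates(counts: Dict[float, Tuple[int, int]]) -> Dict[float, float]:
    """Map each spike dose to the fraction of spiked clones recovered.

    counts maps dose -> (clones_spiked, clones_recovered).
    """
    rates = {}
    for dose, (spiked, recovered) in counts.items():
        if spiked <= 0:
            raise ValueError("clones_spiked must be positive")
        rates[dose] = recovered / spiked
    return rates


# Hypothetical counts at decreasing spike doses (fraction of the library).
example = {1e-2: (100, 50), 1e-3: (100, 25), 1e-4: (100, 0)}
rates = recovery_rates(example)
print(rates)
```

The dose at which recovery falls toward zero gives a rough indication of the
level below which true positives are lost in the background.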

An additional control is the sorbent binding assay described in Preparation B.

Alternatively, the substrate for the determination of β-galactosidase levels
generates a fluorescent product, in which case the assay for expression can be
performed by fluorescence-activated cell sorting (FACS). Similarly, the use of the
fluorescent protein GFP (Cubitt, A.B. et al., Trends Biochem Sci (1995) 20:448-455)
permits the use of FACS to assess expression.



Claims



1. A method to determine whether an agent and a target protein interact,
said method comprising the steps of:

contacting an agent/ligand complex consisting essentially of an agent to be
tested for binding to a target protein coupled to a ligand capable of binding a
proteinaceous ligand-binding domain with

a first fusion protein comprising said target protein and a first complementary
portion of a segregable protein; and

a second fusion protein comprising a proteinaceous ligand-binding domain and
a second complementary portion of said segregable protein; and

detecting whether the first complementary portion and second complementary
portion are brought into proximity.

2. The method of claim 1 wherein the first complementary portion and
second complementary portion of the segregable protein are the cytoplasmic domains
of a signaling receptor.

3. The method of claim 2 wherein the signaling receptor is a T lymphocyte
antigen receptor.

4. The method of claim 1 wherein the first and second complementary 
portions of the segregable protein are subunits of a dimer of Raf-1 serine/threonine 
kinase. 






5. The method of claim 1 wherein the first and second complementary
portions of the segregable protein are the Raf-1 serine/threonine kinase and the
cytoplasmic domain of the human interferon γ receptor (IFγR).

6. The method of claim 1 wherein the first and second complementary
portions of the segregable protein are portions of β-galactosidase.

7. The method of claim 1 wherein the first and second complementary
portions of the segregable protein are portions of a transcription activator.

8. The method of claim 1 wherein the target peptide is a known receptor 
and the agent is a member of a library of small molecules. 

9. The method of claim 1 wherein the agent is a substance of known
biological activity and the target peptide is a member of a library of target peptides.



10. The method of claim 1 wherein the target peptide is a member of a 
panel of target peptides. 


11. A recombinant host cell modified to contain: 

i) a target peptide expression unit which comprises a nucleotide sequence 
encoding a fusion protein comprising a target peptide and a first complementary 
portion of a segregable protein; and 
ii) a ligand binding domain expression unit which comprises a nucleotide
sequence encoding a fusion protein comprising a ligand binding domain and a second
complementary portion of said segregable active protein.

12. The host cell of claim 11 which is further modified to contain a
detectable marker expression unit which comprises an inducible promoter operably
linked to a nucleotide sequence encoding a detectable marker, wherein the expression
of said detectable marker is regulated by said inducible promoter responsive to the
association of said first and second complementary portions.

13. The host cell of claim 12 wherein the first and second complementary
portions are portions of a transcription activator, one of said portions consisting of the
DNA binding domain and the second portion consisting of the RNA polymerase
activation domain.


14. The host cell of claim 11 wherein the first complementary portion and
second complementary portion are cytoplasmic regions of a signaling receptor. 

15. The host cell of claim 11 which is further modified to contain a small
molecule agent covalently attached to a ligand, wherein said ligand binds to said ligand
binding domain.

16. The host cell of claim 15 wherein said ligand is biotin and said ligand
binding domain is a biotin binding peptide. 


17. The host cell of claim 11 wherein said host cell is a yeast cell.

18. The host cell of claim 17 wherein said yeast cell contains a mutation
that reduces the activity of the MDR pump. 


19. The host cell of claim 18 wherein said mutation is the sterile-6
mutation. 

20. The host cell of claim 11, wherein the expression units (i) and (ii) and
the agent/ligand complex are introduced into said host cell using a method selected
from the group consisting of lipofection, electroporation, permeabilization using
Penetratin-1, and chemical transformation with lithium acetate.



WO 98/16835 



PCT/US97/17975 




21. The host cell of claim 11 wherein the expression of said target peptide
expression unit and said ligand binding domain expression unit are controlled by
constitutive promoters.

22. The host cell of claim 13 wherein said detectable marker is selected
from the group consisting of beta-galactosidase, a gene encoding a protein that 
complements a nutritional auxotrophy, green fluorescent protein, and a selectable 
marker. 

23. The host cell of claim 22 wherein said DNA binding domain and 
activation domain are subunits of a protein selected from the group consisting of the 
Gal4, GCN4 and ADR1 proteins. 

24. A population of the host cells of claim 11 wherein said population
comprises at least two host cells that express target peptides with different amino acid 
sequences. 

25. The population of claim 24 wherein the target peptides are products of 
expression of a cDNA library. 

26. A method to determine whether an agent and a target peptide interact, 
said method comprising the steps of: 

a) incubating a recombinant host cell which contains 

i) a detectable marker expression unit which comprises an
inducible promoter operably linked to a nucleotide sequence encoding a detectable
marker, wherein the expression of said detectable marker is regulated by said inducible
promoter,

ii) a target peptide expression unit which comprises a nucleotide
sequence encoding a fusion protein comprising a target peptide and a first portion of a
transcription activator protein selected from the group consisting of a DNA binding
domain and an RNA polymerase activation domain;

iii) a ligand binding domain expression unit which comprises a
nucleotide sequence encoding a fusion protein comprising a ligand binding domain and
a second portion of a transcription activator protein selected from the group consisting
of a DNA binding domain and an RNA polymerase activation domain, whichever is not
employed in (ii); and

iv) an agent/ligand complex wherein said ligand binds to said ligand
binding domain;

under conditions in which said expression units (ii) and (iii) are expressed; 

b) detecting whether said detectable marker is expressed; and 

c) determining the level of binding of said agent to said target peptide by 
the level of expression of said detectable marker. 

27. The method of claim 26 wherein said ligand binding domain is a biotin
binding domain and the ligand of said agent/ligand complex is biotin.



28. The method of claim 26 wherein said host cell is a yeast cell.



29. The method of claim 28 wherein said yeast contains a mutation that
reduces the activity of the MDR pump.

30. The method of claim 29 wherein said mutation is the sterile-6 mutation. 






31. The method of claim 26, wherein the expression units and the test
agent/ligand complex are introduced into said host cell using a method selected from
the group consisting of lipofection, electroporation and permeabilization using
Penetratin-1.

32. The method of claim 26 wherein the expression units (a)(i), (a)(ii) and
(a)(iii) are introduced into said host prior to the introduction of the agent/ligand
complex (a)(iv).

33. The method of claim 26 wherein the expression of said target peptide
expression unit and said ligand binding domain expression unit are controlled by
constitutive promoters.

34. The method of claim 26 wherein said detectable marker is selected from
the group consisting of beta-galactosidase, a gene encoding a protein which
complements a nutritional auxotrophy, and a selectable marker.

35. The method of claim 26 wherein said DNA binding domain and said
activation domain are subunits of a protein selected from the group consisting of the
Gal4, GCN4 and ADR1 proteins.



36. The method of claim 26 wherein the target peptide is a known receptor 
and the agent is a member of a library of small molecules. 

37. The method of claim 26 wherein the agent is a substance of known
biological activity and the target peptide is a member of a library of target peptides.

38. The method of claim 26 wherein the target peptide is a member of a 
panel of target peptides. 